A set of corpus-based text-to-speech synthesis technologies for Mandarin Chinese
Abstract
This paper presents a set of corpus-based text-to-speech synthesis technologies for Mandarin Chinese. A large speech corpus produced by a single speaker is used, and the speech output is synthesized from waveform units of variable lengths, with desired linguistic properties, retrieved from this corpus. Detailed methodologies were developed for designing “phonetically rich” and “prosodically rich” corpora by automatically selecting sentences from a large text corpus to include as many desired phonetic combinations and prosodic features as possible. Automatic phonetic labeling with iterative correction rules and automatic prosodic labeling with a multi-pass top-down procedure were also developed such that the labeling process for the corpora can be completely automatic. Hierarchical prosodic structure for an arbitrary desired text sentence is then generated based on the identification of different levels of break indices, and the prosodic feature sets and appropriate waveform units are finally selected and retrieved from the corpus, modified if necessary, and concatenated to produce the output speech. The special structure of Mandarin Chinese has been carefully considered in all these technologies, and preliminary assessments indicated very encouraging synthesized speech quality.
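The abstract says sentences are selected automatically from a large text corpus so as to include as many desired phonetic combinations as possible, but it does not give the selection procedure. The sketch below is therefore only an illustration, assuming a simple greedy coverage strategy rather than the authors' actual method. The names `greedy_select` and `syllable_bigrams` are hypothetical, and adjacent pairs of space-separated pinyin syllables stand in for whatever phonetic combinations a real design would target.

```python
from typing import Callable, Hashable, List, Set, Tuple


def syllable_bigrams(sentence: str) -> Set[Tuple[str, str]]:
    """Hypothetical unit extractor: adjacent pairs of space-separated syllables."""
    syllables = sentence.split()
    return set(zip(syllables, syllables[1:]))


def greedy_select(
    sentences: List[str],
    units_of: Callable[[str], Set[Hashable]],
    max_sentences: int,
) -> Tuple[List[str], Set[Hashable]]:
    """Repeatedly pick the sentence that adds the most not-yet-covered units.

    This is an assumed greedy coverage heuristic, not the procedure from the paper.
    """
    covered: Set[Hashable] = set()
    chosen: List[str] = []
    remaining = list(sentences)
    while remaining and len(chosen) < max_sentences:
        # Sentence with the largest number of new units; ties go to the earliest one.
        best = max(remaining, key=lambda s: len(units_of(s) - covered))
        gain = units_of(best) - covered
        if not gain:
            break  # no remaining sentence adds a new unit
        chosen.append(best)
        covered |= gain
        remaining.remove(best)
    return chosen, covered


if __name__ == "__main__":
    text_corpus = ["ni3 hao3 ma5", "ni3 hao3", "hao3 ma5 ni3 hao3"]
    picked, covered = greedy_select(text_corpus, syllable_bigrams, max_sentences=2)
    print(picked, len(covered))
```

A "prosodically rich" selection could reuse the same loop with a different `units_of` function that returns prosodic features instead of syllable pairs; that, too, is an assumption about how the idea might be carried over.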
Related resources
A Unit Selection-based Speech Synthesis Approach for Mandarin Chinese
The paper presents a unit selection-based speech synthesis approach for Mandarin Chinese. The unit selection-based approach generates speech by selecting proper units from a speech corpus and connecting them together. In this approach, a set of features is defined to describe the speech units in the corpus and the expected units in the synthesized utterance. Based on the features, cost function is...
Characteristics of Chinese language models for large vocabulary telephone speech
This paper is concerned with language modeling (LM) for large vocabulary speech recognition in Mandarin Chinese. As the language characteristics of Chinese are quite unique, we investigate some novel techniques in language modeling. We also borrow some techniques that have been applied to other languages. Experiments have been conducted on the Call Home Mandarin, HUB4, and HUB5 corpora obtai...
Statistical Analysis of Mandarin Acoustic Units and Automatic Extraction of Phonetically Rich Sentences Based Upon a very Large Chinese Text Corpus
Automatic speech recognition by computers can provide humans with the most convenient method to communicate with computers. Because the Chinese language is not alphabetic and input of Chinese characters into computers is very difficult, Mandarin speech recognition is very highly desired. Recently, high performance speech recognition systems have begun to emerge from research institutes. However...
Multilingual Speech Corpora for TTS System Development
In this paper, four speech corpora collected in the Speech Lab of NCTU in recent years are discussed. They include a Mandarin treebank speech corpus, a Min-Nan speech corpus, a Hakka speech corpus, and a Chinese-English mixed speech corpus. Currently, they are used separately to develop a corpus-based Mandarin TTS system, a Min-Nan TTS system, a Hakka TTS system, and a Chinese-English bilingual...
Concatenative Mandarin TTS Accommodating Isolated English Words
An experiment exploring a method for realizing a concatenative Chinese TTS system that accommodates isolated English words is presented. The experiment was based on an existing concatenative Mandarin TTS system, developed in Motorola China Research Center. The experimental system employs an English word synthesizer based on the concatenation of speech segments stored in an English corpus. The original Engl...
Journal:
- IEEE Trans. Speech and Audio Processing
Volume: 10
Publication date: 2002